A RECOGNITION METHOD AND THE SAME SYSTEM OF
INTEGRATING VOCAL INPUT AND HANDWRITING INPUT

The present application claims priority to Taiwan application No.
092112571, entitled "A recognition method and the same system of
integrating vocal input and handwriting input", filed on May 8, 2003.

BACKGROUND OF THE INVENTION 

1. FIELD OF THE INVENTION 

The present invention relates to a recognition method and a
corresponding system, and in particular to a recognition method and
system that integrate vocal input and handwriting input.

2. RELATED ART

To date, communication between human and machine has been carried
out almost entirely by means of keyboard, mouse, handwriting, vocal
input, image, etc. Handwriting and vocal input in particular have
been widely researched and developed, because they resemble the way
people naturally communicate. However, the results of this research
and development have not reached commercial application, because
neither the recognition ability nor the input efficiency has improved
enough.

In the field of vocal input and handwriting input recognition
technology, many related techniques have been disclosed in various
technical documents. Taking vocal input recognition as an example,
US Patent No. 5692097 discloses a method of recognizing characters
by means of syllables, and Taiwan Publication No. 283744 discloses an
intelligent Mandarin syllabic input method. Taking handwriting input
recognition as an example, US Patent No. 6226403 discloses a
character recognition method using a plurality of input characters,
and US Patent No. 6275611 discloses a method of decomposing the input
character, classifying the substructures, and recognizing the
substructures. This prior art shows that vocal recognition technology
and handwriting recognition technology have each made great progress
separately.

However, all of the above technologies are dedicated to improving
the algorithm, the feature extraction of handwriting/vocal input, or
the standards built for vocal/handwriting patterns. These efforts
improve the recognition rate only slightly. The present invention
therefore proposes integrating handwriting and vocal input to
improve the recognition rate.

US Patent No. 6285785 discloses message recognition employing
integrated speech and handwriting information, and the present
invention refers to this patent as prior art. This patent provides
a method of giving each word a different vocal or handwriting
weight (α, β). For example, to make a word easier to recognize by
vocal input, the vocal weight α may be set higher and the handwriting
weight β may be set lower. Conversely, to make a word easier to
recognize by handwriting input, the vocal weight α may be set lower
than the handwriting weight β.

When the user wants to have an input message recognized, the
user obtains two lists, each comprising a plurality of possible
candidate alphanumeric symbols, one by vocal input and one by
handwriting input. The two lists are then combined into a new list
according to the weights (α, β), and the most similar alphanumeric
symbol is determined, which effectively improves the recognition rate.
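
The list combination described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the two lists are merged, so a weighted sum of per-list similarity scores is assumed, and the function name, the score dictionaries, and the default weight pair are hypothetical.

```python
def combine_candidate_lists(vocal_scores, handwriting_scores, weights,
                            default=(0.5, 0.5)):
    """Merge two candidate lists into one ranking (assumed weighted sum).

    vocal_scores / handwriting_scores: dict mapping symbol -> similarity score.
    weights: dict mapping symbol -> (alpha, beta); symbols without an entry
    use the hypothetical default pair.
    """
    combined = {}
    for symbol in set(vocal_scores) | set(handwriting_scores):
        alpha, beta = weights.get(symbol, default)
        combined[symbol] = (alpha * vocal_scores.get(symbol, 0.0)
                            + beta * handwriting_scores.get(symbol, 0.0))
    # Highest combined score first; ties are broken by symbol for determinism.
    return sorted(combined, key=lambda s: (-combined[s], s))
```

Raising α for a symbol moves it up the merged list when its vocal score is strong, which is the effect the weights are said to have.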

Although the above method can effectively improve the
recognition ability, it still has many problems. First, because this
recognition method requires complete vocal and handwriting data to be
input in advance for each character, the recognition procedure
becomes extremely complex and difficult. Second, Oriental languages
such as Chinese, Korean, or Japanese present not only the complexity
of complete handwriting input but also the characteristic that one
alphanumeric symbol has one syllable, so one alphanumeric symbol
often corresponds to many different pronunciations and one
pronunciation often corresponds to many different alphanumeric
symbols. These constraints make this prior art unsuitable for
Oriental language input systems.

SUMMARY OF THE INVENTION

Therefore, the object of the present invention is to provide a
recognition method and system that are suitable for Oriental syllabic
languages and that integrate vocal input and handwriting input to
improve recognition effectively.

In a preferred embodiment, the recognition system of the present
invention integrating vocal input and handwriting input comprises a
vocal input device, a handwriting input device, a vocal input
similarity estimator, and a handwriting input similarity estimator.

The vocal input device is used for receiving a vocal input having
at least one object alphanumeric symbol, and converting the vocal
input into a first signal. The handwriting input device is used for
receiving a handwriting input describing one feature of the object
alphanumeric symbol, and converting the handwriting input into a
second signal. The vocal similarity estimator is used for generating,
according to the first signal, an alphanumeric symbol array having a
plurality of candidate alphanumeric symbols corresponding to the
object alphanumeric symbol. The handwriting similarity estimator is
used for extracting the best-matching candidate alphanumeric symbol
from the alphanumeric symbol array according to the second signal.
The feature of the object alphanumeric symbol is the radical of the
object alphanumeric symbol.

Based on the above structure, the method of the present invention
integrating vocal input recognition and handwriting input recognition
comprises the steps of: first, receiving a syllabic vocal input signal
of one object alphanumeric symbol; second, recognizing the vocal
input signal and generating an alphanumeric symbol array having a
plurality of candidate alphanumeric symbols corresponding to the
object alphanumeric symbol; then, receiving a handwriting input
signal describing a feature of the object alphanumeric symbol; and
finally, extracting the best-matching candidate alphanumeric
symbol from the alphanumeric symbol array according to the feature.

Therefore, the present invention takes advantage of the
complementarity between vocal input and handwriting input, in
particular by combining a complete vocal input of an alphanumeric
symbol with a partial handwriting input that includes the radical
structure. In this way, the present invention provides more
information for character recognition and therefore improves the
recognition rate effectively.

The examples and illustrations embodying the present invention
are set out in the following description of the preferred
embodiment with reference to the attached drawings.

As a preliminary remark, the present invention is designed for
languages in which each character corresponds to only one syllable,
such as Chinese, Korean, Japanese, etc. This embodiment takes Chinese
as an example, but the invention is not limited to this embodiment.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig.1 is an illustration of a recognition system integrating
vocal input and handwriting input.

Fig.2 is a flow diagram for describing the steps of the present 
invention integrating vocal input and handwriting input. 

Fig.3 is a flow diagram for describing the steps of the present 
invention integrating vocal input and handwriting input. 

Fig.4 is an illustration of a vocal database in which alphanumeric
symbol arrays are built from symbols having the same pronunciation.

Fig.5 is an illustration of tracing the radical of an object 
alphanumeric symbol by handwriting input. 

Fig.6 is an illustration of tracing the substructure of an object 
alphanumeric symbol by handwriting input. 

Fig.7 is an illustration of tracing the radical of another
object alphanumeric symbol by handwriting input.

Fig.8 is an illustration of tracing the radical of yet another
object alphanumeric symbol by handwriting input.

DETAILED DESCRIPTION OF THE INVENTION

Referring to Fig.1, in the preferred embodiment, the recognition
system of the present invention integrating vocal input and
handwriting input comprises a first input device 1, a second input
device 2, a vocal pattern training device 3, a handwriting pattern
training device 4, a first feature extractor 5, a second feature
extractor 6, a vocal-input similarity estimator 7, and a
handwriting-input similarity estimator 8.

The first input device 1 is a vocal input apparatus. For example,
it comprises a microphone or a transducer, and an AD converter (not
shown in Fig.1) connected to the microphone, for receiving the vocal
input from the user and converting the vocal input signal into a
digital signal as the first signal S1. Of course, the vocal input
signal can be sampled at different predefined frequencies, or
processed by means of an FFT, for the subsequent recognition steps.

The second input device 2 is a handwriting input apparatus, for
example a touch panel or a pen panel on which the user writes with a
stylus or by hand. The second input device 2 likewise has an AD
converter (not shown in Fig.1) for sampling the handwriting structure
and converting it into the second signal S2 for the subsequent
recognition steps.

The second signal S2 here is a substructure rather than a
complete handwriting input of one alphanumeric symbol. A general
handwriting input device is designed to let the user input within a
predetermined time span; if the user does not continue handwriting
input during the time span, the handwriting motion is considered
complete. The second signal S2 therefore means the stroke input
entered during a predetermined time span. That stroke input may be
only a substructure, a radical, or the whole of one alphanumeric
symbol.
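
The time-span rule above can be sketched as follows. This is a simplified illustration, assuming each stroke is reduced to a single timestamp in seconds; the function name and that representation are hypothetical and not part of the described device.

```python
def split_into_inputs(stroke_times, time_span):
    """Group ascending stroke timestamps into separate handwriting inputs.

    A pause longer than time_span closes the current input, mirroring the
    rule that handwriting is considered complete when the user does not
    continue within the predetermined time span.
    """
    inputs = []
    current = []
    for t in stroke_times:
        if current and t - current[-1] > time_span:
            inputs.append(current)
            current = []
        current.append(t)
    if current:
        inputs.append(current)
    return inputs
```

With a 2-second time span, strokes at 0.0, 0.5, and 1.0 seconds form one input (one second signal S2), and a stroke at 4.0 seconds starts a new one.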

The vocal pattern training device 3 is used for recognizing the
first signal S1 transmitted by the first input device 1 according to
hidden Markov model technology, and for building up a personal vocal
pattern. Further description of the hidden Markov model has been
disclosed in related prior art such as US Patent No. 6285785 or
Taiwan Publication No. 308666.

The handwriting pattern training device 4 is used for recognizing
the second signal S2 transmitted by the second input device 2, and
for building up a personal handwriting pattern. The handwriting
pattern is built up by utilizing pattern recognition technology,
which has been disclosed in related prior art such as US Patent
No. 5982929.

In addition, the present invention further comprises a vocal
database 30 and a handwriting database 40. The vocal database 30
stores a plurality of vocal patterns, associated Chinese
vocabulary/phrases, Chinese grammar rules, etc. For the convenience
of the following recognition steps, the data of vocal database 30 is
organized as shown in Fig.4 (from top to bottom: [fon], [fon /],
[fon v], and [fon \], respectively), according to pronunciation and
usage rate. In other words, each alphanumeric symbol array is
constructed from many candidate alphanumeric symbols having the same
pronunciation, and the position of a candidate alphanumeric symbol
represents its usage rate: the further left the position, the more
frequently the symbol is used. The data of handwriting database 40,
on the other hand, is sorted by the strokes and the radical of the
object alphanumeric symbol. The method of building up such a database
by radical or strokes of the alphanumeric symbol has been disclosed
in related prior art such as US Patent No. 6539113.
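
The organization of vocal database 30 can be sketched as a mapping from pronunciation to a usage-ordered array. This is a minimal illustration: the function name, the keys such as "fon2" (a hypothetical encoding of syllable plus tone), the usage counts, and the example characters are all assumptions and are not taken from Fig.4.

```python
def build_vocal_database(entries):
    """Group symbols by pronunciation and order each array by usage rate.

    entries: iterable of (pronunciation, symbol, usage_count) tuples.
    Returns a dict mapping pronunciation -> list of symbols, with the most
    frequently used symbol first (the leftmost position in the text).
    """
    grouped = {}
    for pronunciation, symbol, usage in entries:
        grouped.setdefault(pronunciation, []).append((usage, symbol))
    return {
        pronunciation: [symbol for _, symbol in
                        sorted(items, key=lambda item: -item[0])]
        for pronunciation, items in grouped.items()
    }


# Illustrative data only; the counts are invented for the example.
EXAMPLE_ENTRIES = [
    ("fon2", "縫", 45),
    ("fon2", "逢", 120),
    ("fon2", "馮", 80),
    ("fon4", "鳳", 30),
    ("fon4", "奉", 60),
]
```

Looking up one pronunciation then returns the whole candidate array in a single step, ready for the handwriting stage to narrow down.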

The dotted arrows in Fig.1 show the data flow direction in the
training mode of the present invention. When the first input device 1
and the second input device 2 are used for input, the vocal pattern
training device 3 and the handwriting pattern training device 4 build
up personal vocal and handwriting patterns from the first signal S1
and the second signal S2 by utilizing the data stored in vocal
database 30 and handwriting database 40, and then store the personal
voice data and handwriting data in vocal database 30 and handwriting
database 40 respectively, to accelerate the recognition procedure and
improve the recognition rate.

The first feature extractor 5 is connected to the first input
device 1, so it receives the first signal S1 and extracts the first
input vector V1 from the first signal S1. The first input vector V1
is extracted, for example, by sampling the amplitude change within
certain frequency ranges, to obtain a plurality of feature vectors
belonging to different frequency ranges. In the same way, the second
feature extractor 6 is connected to the second input device 2, so it
receives the second signal S2, extracts the second input vector V2
from the second signal S2, and generates a plurality of feature
vectors V2.
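
One way to picture the frequency-range features is the sketch below. It is an assumption-laden illustration, not the extractor of the embodiment: it takes a single frame of samples, uses a naive discrete Fourier transform, and reports the mean magnitude per band; the function name and the band boundaries are hypothetical.

```python
import cmath
import math


def band_amplitudes(samples, sample_rate, bands):
    """Return the mean DFT magnitude inside each (low_hz, high_hz) band.

    Uses a naive O(n^2) DFT over the non-negative frequency bins, which is
    enough for a short illustrative frame.
    """
    n = len(samples)
    spectrum = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]
    features = []
    for low, high in bands:
        magnitudes = [m for k, m in enumerate(spectrum)
                      if low <= k * sample_rate / n < high]
        features.append(sum(magnitudes) / len(magnitudes) if magnitudes else 0.0)
    return features
```

Repeating this over successive frames would give the amplitude change over time within each frequency range.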

The first similarity estimator 7 is connected to vocal database 30
and the first feature extractor 5. The second similarity estimator 8
is connected to handwriting database 40 and the second feature
extractor 6. According to the vocal pattern of vocal database 30, the
first similarity estimator 7 extracts a possible alphanumeric symbol
array or alphanumeric symbol from the vocal database 30 according to
the first signal S1. Because the vocal pattern has been built up,
irrelevant data can be bypassed effectively, which saves the search
time of the first similarity estimator 7 in the vocal database 30.

In the same way, the second similarity estimator 8 extracts a
possible alphanumeric symbol array or alphanumeric symbol from the
handwriting database 40 according to the second signal S2. In
addition, the first similarity estimator 7 and the second similarity
estimator 8 are connected to each other. Therefore, for example, when
the first similarity estimator 7 determines an alphanumeric symbol
array from vocal database 30 according to the vocal input of the
user, the second similarity estimator 8 can, according to the
handwriting input, select the matching alphanumeric symbol from the
alphanumeric symbol array determined by the first similarity
estimator 7.

Finally, the candidate alphanumeric symbol determined by the
first similarity estimator 7 and the second similarity estimator 8 is
transmitted to an application program, such as Microsoft Word, and
shown on display 50. Apart from the first input device 1 and the
second input device 2, the functions of the other devices are
implemented in program code executed by a computer. The data to be
used is built into vocal database 30 and handwriting database 40 in
advance.

Based on the above structure, and as shown in Fig.2, the
recognition method of the present invention integrating vocal and
handwriting input begins, as shown in steps 21 and 22, by receiving a
first input. The first input device 1 is utilized to receive the
vocal input and convert it to the first signal S1. For example, if
the user wants to input the character shown in Fig.5, the user can
make a vocal input by pronouncing [fon /] as the first input. The
first input is recognized by the first feature extractor 5 and the
first similarity estimator 7, which then extract the corresponding
data from the vocal database 30 to generate an alphanumeric symbol
array matching the first input. The alphanumeric symbol array
extracted in this example is shown in Fig.4; the candidate
alphanumeric symbols of the array are sorted according to usage rate.

As step 23 shows, a time span, such as 2 seconds, can be
predetermined by user definition or by a program default, and the
system detects whether a second input occurs during the predetermined
time span.

During this time span, if the user utilizes the second input device
2 to input a feature characterizing the character of Fig.5, then, as
step 24 shows, the one character corresponding to the second input is
extracted from the alphanumeric symbol array. In this embodiment, the
input feature of the alphanumeric symbol is the radical of the
alphanumeric symbol. Therefore, as Fig.5 shows, the user may input
the radical of the alphanumeric symbol (shown filled in at the left
side).

After the first similarity estimator 7 extracts the alphanumeric
symbol array corresponding to the pronunciation [fon /], the pattern
recognition technology of the second similarity estimator 8 is
utilized to search that array for the alphanumeric symbol with a
similar shape or radical. By this procedure, the alphanumeric symbol
of Fig.5, with the radical shown at the left side of Fig.5, is
clearly the best-matching alphanumeric symbol under the constraint of
the second input. As step 25 shows, this best-matching alphanumeric
symbol is shown on display 50. Of course, to represent the
handwriting feature of the alphanumeric symbol of Fig.5, the user may
handwrite only a part of the radical, such as the filled shape at the
left side of Fig.6, or only a part of the alphanumeric symbol that
distinguishes it from the other candidate alphanumeric symbols; the
system can still perform pattern recognition and extract the object
alphanumeric symbol.

In the same way, in another example the user wants to input the
character shown in Fig.7. If the user makes the vocal input [fen \],
the system generates an alphanumeric symbol array including candidate
alphanumeric symbols corresponding to the vocal input. The user only
has to handwrite the radical (shown filled in at the upper side) of
the character of Fig.7, and, as Fig.7 shows, that character is
extracted from the alphanumeric symbol array by pattern recognition
technology.

In a further example, the user wants to input the character shown
in Fig.8. First, if the user makes the vocal input [pau \], the
candidate alphanumeric symbols are sorted according to usage rate and
listed. As Fig.8 shows, if the user handwrites a radical (shown
filled in at the left side), the alphanumeric symbol of Fig.8
including that radical is selected by the second similarity estimator
8. Of course, if the user handwrites a different radical, a different
alphanumeric symbol including that radical is selected. From the
above examples, it can be clearly understood that the present
invention effectively utilizes both the vocal and the handwriting
characteristics of Chinese. The user conveniently needs only a vocal
input and a partial handwriting input for the character to be
recognized and input.
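
The idea that a partial handwriting input only needs to distinguish the object symbol from the other candidates can be sketched as follows. This assumes the handwriting has already been recognized as a known component; the real system performs pattern recognition on the strokes. The table, the function names, and the example characters are hypothetical and are not taken from the figures.

```python
# Hypothetical component table for three illustrative same-sounding characters.
COMPONENTS = {
    "逢": {"辶", "夆"},
    "縫": {"糸", "辶", "夆"},
    "馮": {"冫", "馬"},
}


def match_by_component(candidate_array, written_component, components=COMPONENTS):
    """Return the candidates containing the written component, in usage order."""
    return [symbol for symbol in candidate_array
            if written_component in components.get(symbol, set())]


def is_distinguishing(candidate_array, written_component, components=COMPONENTS):
    """True when the written component singles out exactly one candidate."""
    return len(match_by_component(candidate_array, written_component,
                                  components)) == 1
```

In this toy table the component 糸 singles out 縫, whereas 夆 is shared by two candidates and would not be enough on its own.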

On the other hand, as steps 26 and 27 show, if there is no second
input, the present invention acts merely as a vocal recognition
apparatus: it extracts the most frequently used character according
to the vocal input [fon /] and the usage rate. Of course, in this
situation the recognition rate is not improved unless the intended
alphanumeric symbol happens to be the most frequently used character.
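
The flow of Fig.2 (steps 21 to 27) can be sketched as follows. This is a simplified illustration: the function and parameter names are hypothetical, the handwriting is assumed to be already recognized as a radical, the example characters are not taken from the figures, and falling back to the most frequently used candidate when the radical matches nothing is an assumption the text does not address.

```python
def recognize(candidate_array, radical_of, second_input=None, delay=None,
              time_span=2.0):
    """Pick one symbol from a usage-ordered candidate array.

    candidate_array: symbols sharing the spoken pronunciation, most used first.
    radical_of: dict mapping symbol -> radical (hypothetical lookup table).
    second_input: the handwritten radical, or None if nothing was written.
    delay: seconds between the vocal input and the handwriting input.
    """
    if not candidate_array:
        return None
    in_time = (second_input is not None and delay is not None
               and delay <= time_span)
    if in_time:
        # Step 24: extract the candidate matching the handwritten feature.
        for symbol in candidate_array:
            if radical_of.get(symbol) == second_input:
                return symbol
    # Steps 26 and 27: no usable second input, so behave as a vocal
    # recognizer and return the most frequently used candidate.
    return candidate_array[0]
```

For example, with the array ordered 逢, 馮, 縫 and the radical 糸 written within the time span, the sketch returns 縫; without a second input it returns 逢.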

In addition, referring to Fig.3, the present invention can also be
carried out as in steps 201 to 203: if the user makes the vocal input
[fon /], the most frequently used candidate alphanumeric symbol (the
alphanumeric symbol of Fig.5 without the left-side radical part) is
shown on the display 50. If the user finds that the intended input is
the alphanumeric symbol of Fig.5 rather than the most frequently used
candidate alphanumeric symbol, the user can make the second input
(the radical shown filled in at the left side of Fig.5) within a
predetermined time span. As steps 204 and 205 show, the present
invention extracts the alphanumeric symbol of Fig.5 from the
alphanumeric symbol array according to the second input and, as step
206 shows, replaces the most frequently used candidate alphanumeric
symbol with the alphanumeric symbol having the radical feature.
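
The display-then-replace variant of Fig.3 can be sketched by recording what the display shows over time. This is only an illustration: the function name, the radical lookup table, and the example characters are hypothetical, and the timing check on the second input is omitted for brevity.

```python
def recognize_with_replacement(candidate_array, radical_of, second_input=None):
    """Return the sequence of symbols shown on the display.

    Steps 201 to 203: show the most frequently used candidate first.
    Steps 204 to 206: if a handwritten radical arrives and matches another
    candidate, that candidate replaces the one already shown.
    """
    if not candidate_array:
        return []
    shown = [candidate_array[0]]
    if second_input is not None:
        for symbol in candidate_array:
            if radical_of.get(symbol) == second_input:
                if symbol != shown[-1]:
                    shown.append(symbol)  # the replacement appears last
                break
    return shown
```

The last element of the returned list is what remains on the display; a list of length two means a replacement took place.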

Owing to the characteristics of Chinese, even when a plurality of
alphanumeric symbols correspond to the same pronunciation, the
difference between the alphanumeric symbols is quite obvious. Taking
the Chinese characters in Fig.4 as an example, the radical and
written form of each alphanumeric symbol are quite different.
Therefore, through the complementarity between vocal input and
handwriting input, the user can easily and effectively improve the
recognition rate by combining the vocal input of an alphanumeric
symbol with a handwriting input of its radical part, rather than
handwriting each complex alphanumeric symbol completely. The present
invention thus makes input and recognition more efficient.

Although the present invention has been described and 
illustrated in detail, it is to be clearly understood that the same is by
way of illustration and example only and is not to be taken by way
of limitation, the spirit and scope of the present invention being 
limited only by the terms of the appended claims.